ABSTRACT

We consider the sequential detection of a Markov sequence in a
linear system with interrupted observations, i.e., systems with a
switching environment. Because of the excessive computational
requirements for optimum procedures, three suboptimum filters are
discussed, all of which feed into a sequential likelihood-ratio
detector. The results of a computer simulation are also presented.


I.  Introduction

It is common to assume linear models for the state and observation
when formulating dynamical system problems. The resulting equations
are simple to analyze and, provided suitable criteria are chosen for
optimization, yield attractive solutions such as the well-known
Kalman filter. Implicit, however, is the assumption that the origin
of the observations is known, which may not be true in practice. In
this case the structure of the optimal solution may change
completely. This practical aspect of the model and its consequences
in problem analysis have been addressed recently (as in [1]-[7]).
Here, we present results of a preliminary investigation in a related
problem area.

The particular problem analyzed in this report is characterized by
an observation sequence whose noise switches in a Markovian manner.
Such a problem arises in multi-target tracking, [7], and was first
treated by Ackerson and Fu, who derived the Bayesian optimal
estimator of the state, [5]. In this investigation, we focus on the
detection part of the problem and present a sequential Bayesian
optimal detector for the switching sequence, denoted by
$\{\gamma_k\}$.
The distinction between the present problem and conventional
detection problems should be made clear: here the sequence
$\{\gamma_k\}$ changes according to a Markovian distribution and
hence the true hypothesis switches from one stage to another.
Therefore, techniques derived for problems with linear models - such
as in [8] - are not applicable here, since they assume the complete
observation sequence belongs to either $H_0$ or $H_1$. Furthermore,
sequential detection procedures - see for example [9] - implicitly
make the above assumption and defer the decision on $H_0$ or $H_1$
until time k+1, if the sequence of observations up to time k is not
informative enough.

The report begins with a statement of the problem in Sec. II and
then proceeds to derive the Bayesian optimal detector in Sec. III.
The algorithm obtained has the nice property of being recursive;
however, it requires numerical integration of p.d.f.'s and hence is
not practical. Therefore, we also derive three different suboptimal
schemes in Sec. IV and give the corresponding detection algorithms.
The results of a simulation are described in Sec. V and are followed
by some observations and a comparison of the procedures in Sec. VI.
In Sec. VII we draw several conclusions and discuss possible
extensions of this study.

We are given a discrete-time linear system in which the measurement
noise has a Markov dependent statistical property. It is described
by the following equations:

$$x_k = \Phi_{k,k-1}\, x_{k-1} + G_{k-1}\, u_{k-1} \tag{1}$$

$$z_k = H_k x_k + v_k + \gamma_k w_k \tag{2}$$

where $x_k$ and $u_k$ are vectors of dimension n×1 and r×1 while
$\Phi_{k,k-1}$ and $G_{k-1}$ are matrices of the appropriate
dimension. We assume that the initial state $x_0$ is normally
distributed and the sequence $\{u_k\}$ is white Gaussian with zero
mean. Thus:

$$x_0 \sim N(\cdot \mid \mu_0, V_0), \tag{3}$$

$$u_k \sim N(\cdot \mid 0, V_u(k)), \qquad E(u_j u_k^T) = V_u(k)\,\delta_{j,k}. \tag{4}$$

The state vector enters the measurement equation, (2), linearly and
is corrupted by $v_k$ or $v_k + w_k$ depending on whether $\gamma_k$
is 0 or 1, respectively. The vectors $z_k$, $v_k$ and $w_k$ are all
of dimension m×1 and $H_k$ is an m×n matrix. Again we assume
$\{v_k\}$ and $\{w_k\}$ to be white noise sequences with the
following statistics:

$$v_k \sim N(\cdot \mid 0, V_v(k)), \tag{5}$$

$$w_k \sim N(\cdot \mid 0, V_w(k)). \tag{6}$$


The sequence $\{\gamma_k\}$ is a binary Markov chain defined on the
state space $\{0,1\}$ and is statistically described by an initial
probability vector $(1-p_0,\ p_0)^T$ and a transition probability
matrix

$$P = \begin{bmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{bmatrix}. \tag{7}$$


In the above it is assumed that the vector $x_0$ and the sequences
$\{u_k\}$, $\{v_k\}$ and $\{w_k\}$ as well as $\{\gamma_k\}$ are all
mutually independent.

Our problem is to decide - at each stage k - on the value of
$\gamma_k$ minimizing the probability of error in detection.
Formally,

$$\min_{\hat\gamma_k}\ \Pr\{\hat\gamma_k \neq \gamma_k \mid z_0,\ldots,z_k\} \tag{8}$$

where $\hat\gamma_k = \hat\gamma_k(z_0,\ldots,z_k)$, and $z_k$, $x_k$
satisfy Eqs. (1) and (2) with the underlying statistics given by
Eqs. (3)-(7).

The derivation of the detector is described in the following
section.
III.  The Bayesian Optimal Detector

The minimum probability of error problem of Sec. II is equivalent to
a Bayesian decision problem in which we minimize the Bayes' risk for
a special choice of the cost matrix (see for example [10]).
Therefore, the problem under consideration reduces to testing - at
each time k - the hypotheses:

$$H_1:\ z_k = H_k x_k + v_k$$

$$H_2:\ z_k = H_k x_k + v_k + w_k$$

where $x_k$ evolves according to Eq. (1) and the true hypothesis
switches from one stage to another according to the transition
probability matrix P of Eq. (7).

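By using Eq. (12), Ch. 2 of [10], the Bayesian optimal rule can be
written as$^1$:

$$L_k(z^k) \triangleq \frac{f(z^k \mid \gamma_k = 1)}{f(z^k \mid \gamma_k = 0)} \tag{9}$$

$$L_k(z^k)\ \underset{H_1}{\overset{H_2}{\gtrless}}\ \frac{\text{prior probability } H_1 \text{ is true}}{\text{prior probability } H_2 \text{ is true}} \tag{10}$$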

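Consequently, the problem is basically that of evaluating the
quantities appearing in Eq. (10). We begin with the densities
$f(z^k \mid \gamma_k)$ and apply Bayes' rule to get:

$$\begin{aligned}
f(z^k \mid \gamma_k)
&= f(z_k \mid z^{k-1},\gamma_k)\, f(z^{k-1} \mid \gamma_k) \\
&= f(z_k \mid z^{k-1},\gamma_k)\, \frac{f(z^{k-1},\gamma_k)}{p(\gamma_k)} \\
&= f(z_k \mid z^{k-1},\gamma_k) \times
\frac{f(z^{k-1} \mid \gamma_k,\gamma_{k-1}=0)\,p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0\}
      + f(z^{k-1} \mid \gamma_k,\gamma_{k-1}=1)\,p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1\}}
     {p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0\} + p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1\}} \\
&= f(z_k \mid z^{k-1},\gamma_k) \times f(z^{k-1} \mid \gamma_k,\gamma_{k-1}=0) \times
\frac{1 + \dfrac{f(z^{k-1} \mid \gamma_k,\gamma_{k-1}=1)}{f(z^{k-1} \mid \gamma_k,\gamma_{k-1}=0)}\,
          \dfrac{p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1\}}{p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0\}}}
     {1 + \dfrac{p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1\}}{p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0\}}}.
\end{aligned} \tag{11}$$

$^1$ For convenience we use the notation $z^k \triangleq (z_0,\ldots,z_k)$.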


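Noting that $f(z^{k-1} \mid \gamma_k,\gamma_{k-1}) = f(z^{k-1} \mid \gamma_{k-1})$
by the Markov property and defining
$L_{k-1}(z^{k-1}) \triangleq f(z^{k-1} \mid \gamma_{k-1}=1)/f(z^{k-1} \mid \gamma_{k-1}=0)$,
we obtain upon substitution from (11) into (10):

$$L_k(z^k) = \frac{f(z_k \mid z^{k-1},\gamma_k=1)}{f(z_k \mid z^{k-1},\gamma_k=0)}
\cdot
\frac{1 + L_{k-1}(z^{k-1})\,\dfrac{P(1|1)\Pr\{\gamma_{k-1}=1\}}{P(1|0)\Pr\{\gamma_{k-1}=0\}}}
     {1 + L_{k-1}(z^{k-1})\,\dfrac{P(0|1)\Pr\{\gamma_{k-1}=1\}}{P(0|0)\Pr\{\gamma_{k-1}=0\}}}
\cdot
\frac{1 + \dfrac{P(0|1)\Pr\{\gamma_{k-1}=1\}}{P(0|0)\Pr\{\gamma_{k-1}=0\}}}
     {1 + \dfrac{P(1|1)\Pr\{\gamma_{k-1}=1\}}{P(1|0)\Pr\{\gamma_{k-1}=0\}}}, \tag{12}$$

where $P(i|j) \triangleq p(\gamma_k = i \mid \gamma_{k-1} = j)$.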


Eq. (12) suggests that we can carry out our detection scheme in a
sequential manner by using $L_{k-1}(z^{k-1})$ and
$\Pr\{\gamma_{k-1} = i\}$, for i = 0,1, obtained at stage (k-1), and
by computing the density of $z_k$ conditioned on the observations
$z^{k-1}$ under each hypothesis. Another implication of Eq. (12) is
that it reduces to Scharf's and Nolte's result, [8], when we set P to
the identity matrix (no switching between hypotheses), as they
assumed. The likelihood ratio test can now be expressed explicitly,
by substituting from Eq. (12) into Eq. (10), thus yielding:

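$$L_k(z^k) = \frac{f(z_k \mid z^{k-1},\gamma_k=1)}{f(z_k \mid z^{k-1},\gamma_k=0)}
\cdot
\frac{1 + L_{k-1}(z^{k-1})\,\dfrac{P(1|1)\Pr\{\gamma_{k-1}=1\}}{P(1|0)\Pr\{\gamma_{k-1}=0\}}}
     {1 + L_{k-1}(z^{k-1})\,\dfrac{P(0|1)\Pr\{\gamma_{k-1}=1\}}{P(0|0)\Pr\{\gamma_{k-1}=0\}}}
\cdot
\frac{1 + \dfrac{P(0|1)\Pr\{\gamma_{k-1}=1\}}{P(0|0)\Pr\{\gamma_{k-1}=0\}}}
     {1 + \dfrac{P(1|1)\Pr\{\gamma_{k-1}=1\}}{P(1|0)\Pr\{\gamma_{k-1}=0\}}}
\ \underset{H_1}{\overset{H_2}{\gtrless}}\
\frac{\Pr\{\gamma_k = 0\}}{\Pr\{\gamma_k = 1\}} \tag{10-a}$$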
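The recursion behind the test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the argument layout, and the storage convention `P[j][i] = p(gamma_k = i | gamma_{k-1} = j)` are hypothetical and not taken from the report, and the expression is the one that follows term by term from Eq. (11). The two conditional densities are passed in as plain numbers.

```python
def likelihood_ratio_step(f1, f0, L_prev, pr_prev, P):
    """One step of the likelihood-ratio recursion implied by Eq. (11).

    f1, f0  : f(z_k | z^{k-1}, gamma_k = 1) and f(z_k | z^{k-1}, gamma_k = 0)
    L_prev  : L_{k-1}(z^{k-1})
    pr_prev : [Pr{gamma_{k-1} = 0}, Pr{gamma_{k-1} = 1}]
    P       : P[j][i] = p(gamma_k = i | gamma_{k-1} = j)  (assumed layout)
    Requires P[0][0] > 0 and P[0][1] > 0 to avoid division by zero.
    """
    odds_prev = pr_prev[1] / pr_prev[0]
    r1 = P[1][1] / P[0][1] * odds_prev   # P(1|1)Pr{1} / (P(1|0)Pr{0})
    r0 = P[1][0] / P[0][0] * odds_prev   # P(0|1)Pr{1} / (P(0|0)Pr{0})
    return (f1 / f0) * (1 + L_prev * r1) / (1 + r1) * (1 + r0) / (1 + L_prev * r0)


def decide(L_k, prior_k):
    """Threshold test: pick gamma_k = 1 (H2) when L_k exceeds Pr{gamma_k=0}/Pr{gamma_k=1}."""
    return 1 if L_k > prior_k[0] / prior_k[1] else 0


# i.i.d. switching: the past likelihood ratio carries no information.
P_iid = [[0.5, 0.5], [0.5, 0.5]]
L_iid = likelihood_ratio_step(0.2, 0.1, 7.0, [0.5, 0.5], P_iid)

# Almost no switching: the ratio approaches the classical product form.
eps = 1e-9
P_static = [[1 - eps, eps], [eps, 1 - eps]]
L_static = likelihood_ratio_step(0.2, 0.1, 7.0, [0.5, 0.5], P_static)
```

In the i.i.d. case the result equals the one-step density ratio, and as the switching probability tends to zero the result tends to the density ratio times `L_prev`, which is the non-switching recursion mentioned in the text.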


An  Algorithm  for  the  Sequential  Detector 

We now proceed to evaluate the quantities appearing in Eq. (10-a),
starting with the R.H.S. Using the law of total probability and the
Markov property, $p(\gamma_k)$ may be expressed as follows:

$$p(\gamma_k) = p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0\} + p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1\}. \tag{13}$$
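As a small numerical illustration of Eq. (13), the prior of the switching variable can be propagated as a vector-matrix product. The sketch below is not from the report: the function name, the numerical values of `alpha` and `beta`, and the layout `P[i][j] = p(gamma_k = j | gamma_{k-1} = i)` are assumptions made for the example.

```python
def propagate_prior(p_prev, P):
    """Eq. (13): p(gamma_k = j) = sum_i P[i][j] * Pr{gamma_{k-1} = i}."""
    return [sum(P[i][j] * p_prev[i] for i in range(2)) for j in range(2)]


alpha, beta = 0.1, 0.3                       # hypothetical switching probabilities
P = [[1 - alpha, alpha], [beta, 1 - beta]]   # assumed row = previous state
p_one_step = propagate_prior([0.5, 0.5], P)

# With identical rows the chain is i.i.d. and the prior never changes.
p_iid = propagate_prior([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
```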

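Next we consider the L.H.S. of Eq. (10-a). Both $L_{k-1}(z^{k-1})$
and $p(\gamma_{k-1})$ are assumed to be computed at stage (k-1). The
density $f(z_k \mid z^{k-1},\gamma_k)$ can be evaluated using the law
of total probability, which gives:

$$\begin{aligned}
f(z_k \mid z^{k-1},\gamma_k)
&= \int_{R^n} dx_k\, f(x_k \mid z^{k-1},\gamma_k)\, f(z_k \mid x_k, z^{k-1},\gamma_k) \\
&= \int_{R^n} dx_k\, f(x_k \mid z^{k-1},\gamma_k)\, N(z_k \mid H_k x_k, V(k)).
\end{aligned} \tag{14}$$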

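Here we made use of Eq. (2) and the term V(k) is either $V_v(k)$ or
$V_v(k) + V_w(k)$ depending on whether $\gamma_k$ = 0 or 1,
respectively. If we then use the Markov property, Eq. (1), and apply
the law of total probability, $f(x_k \mid z^{k-1},\gamma_k)$ can be
expressed as:

$$\begin{aligned}
f(x_k \mid z^{k-1},\gamma_k)
&= f(x_k \mid z^{k-1},\gamma_{k-1}=0)\,
   \frac{p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0 \mid z^{k-1}\}}
        {\sum_{i=0}^{1} p(\gamma_k \mid \gamma_{k-1}=i)\Pr\{\gamma_{k-1}=i \mid z^{k-1}\}} \\
&\quad + f(x_k \mid z^{k-1},\gamma_{k-1}=1)\,
   \frac{p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1 \mid z^{k-1}\}}
        {\sum_{i=0}^{1} p(\gamma_k \mid \gamma_{k-1}=i)\Pr\{\gamma_{k-1}=i \mid z^{k-1}\}}
\end{aligned} \tag{15}$$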


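where

$$\begin{aligned}
f(x_k \mid z^{k-1},\gamma_{k-1}=i)
&= \int_{R^n} dx_{k-1}\, f(x_k \mid x_{k-1}, z^{k-1},\gamma_{k-1}=i)\, f(x_{k-1} \mid z^{k-1},\gamma_{k-1}=i) \\
&= \int_{R^n} dx_{k-1}\, N\!\left(x_k \mid \Phi_{k,k-1}x_{k-1},\ G_{k-1}V_u(k-1)G_{k-1}^T\right)
   \cdot \frac{f(z_{k-1} \mid x_{k-1},\gamma_{k-1}=i)\, f(x_{k-1} \mid z^{k-2},\gamma_{k-1}=i)}
              {f(z_{k-1} \mid z^{k-2},\gamma_{k-1}=i)}.
\end{aligned} \tag{16}$$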


Clearly, the quantities appearing in Eqs. (15) and (16) are either
Gaussian p.d.f.'s or are available from stage (k-1), and thus
$f(x_k \mid z^{k-1},\gamma_k)$ can be calculated recursively. It
remains to compute $p(\gamma_k \mid z^k)$, which shall be needed for
the next stage (k+1). Applying Bayes' rule we get,

$$p(\gamma_k \mid z^k) = \frac{f(z_k \mid z^{k-1},\gamma_k)\, p(\gamma_k \mid z^{k-1})}{\sum_{\gamma_k=0}^{1} \text{numerator}} \tag{17}$$

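and the law of total probability then gives us:

$$p(\gamma_k \mid z^{k-1}) = p(\gamma_k \mid \gamma_{k-1}=0)\Pr\{\gamma_{k-1}=0 \mid z^{k-1}\} + p(\gamma_k \mid \gamma_{k-1}=1)\Pr\{\gamma_{k-1}=1 \mid z^{k-1}\}. \tag{18}$$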
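Eqs. (17) and (18) together form a predict-then-correct update for the posterior of the switching variable. A minimal sketch, assuming the two conditional densities have already been evaluated as numbers; the function name and the layout `P[i][j] = p(gamma_k = j | gamma_{k-1} = i)` are illustrative choices, not the report's.

```python
def posterior_update(f_z, post_prev, P):
    """Return [Pr{gamma_k=0 | z^k}, Pr{gamma_k=1 | z^k}].

    f_z       : [f(z_k | z^{k-1}, gamma_k=0), f(z_k | z^{k-1}, gamma_k=1)]
    post_prev : [Pr{gamma_{k-1}=0 | z^{k-1}}, Pr{gamma_{k-1}=1 | z^{k-1}}]
    P         : P[i][j] = p(gamma_k = j | gamma_{k-1} = i)  (assumed layout)
    """
    # Eq. (18): one-step prediction of gamma_k from the previous posterior.
    predicted = [sum(P[i][j] * post_prev[i] for i in range(2)) for j in range(2)]
    # Eq. (17): Bayes' rule, normalised over gamma_k = 0, 1.
    numerators = [f_z[j] * predicted[j] for j in range(2)]
    total = sum(numerators)
    return [n / total for n in numerators]


P_example = [[0.9, 0.1], [0.3, 0.7]]          # hypothetical transition matrix
uninformative = posterior_update([1.0, 1.0], [0.5, 0.5], P_example)
informative = posterior_update([0.1, 0.4], [0.5, 0.5], P_example)
```

When the measurement is equally likely under both hypotheses, the posterior equals the prediction of Eq. (18); otherwise it shifts toward the hypothesis with the larger density.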


The equations just derived constitute the basis for an algorithm
that detects $\gamma_k$ sequentially. Formally, it is given by:


Algorithm 0

step 1  Start with

        $$f(x_0 \mid z^{-1},\gamma_0) = f(x_0) = N(x_0 \mid \mu_0, V_0)$$

        $$f(z_0 \mid z^{-1},\gamma_0) = f(z_0 \mid \gamma_0) = N(z_0 \mid H_0\mu_0,\ H_0V_0H_0^T + V_v(0) + \gamma_0V_w(0))$$

        $$L_0(z^0) = \frac{f(z_0 \mid \gamma_0=1)}{f(z_0 \mid \gamma_0=0)}$$

        $$\Pr\{\gamma_0=1\} = p_0 \neq \Pr\{\gamma_0=1 \mid z^0\}$$

step 2  Assume we are at stage k. Then use Eqs. (16), (15) and (14)
        to calculate $f(x_k \mid z^{k-1},\gamma_{k-1})$,
        $f(x_k \mid z^{k-1},\gamma_k)$ and
        $f(z_k \mid z^{k-1},\gamma_k)$, respectively. Also compute
        $p(\gamma_k)$ using Eq. (13).

step 3  Calculate $L_k(z^k)$ as well as decide on $\gamma_k$ using
        the test (10-a).

step 4  Determine $p(\gamma_k \mid z^k)$ from Eqs. (18) and (17) and
        store for the next stage together with the already computed
        values of $f(x_k \mid z^{k-1},\gamma_k)$,
        $f(z_k \mid z^{k-1},\gamma_k)$, $p(\gamma_k)$ and $L_k(z^k)$.

step 5  Set k = k+1 and go to step 2.


The above algorithm is, in principle, straightforward; however, its
implementation is not so simple. This stems from the fact that Eqs.
(15) and (14) call for the computation of the p.d.f.'s
$f(x_k \mid z^{k-1},\gamma_{k-1}=i)$, i = 0,1, and carrying out
numerical integration. Such a computation is prohibitive, especially
for systems of dimension greater than 1, as pointed out by Jaffer and
Gupta in the context of a similar problem, [3].


One approach to alleviate this problem is to use a decomposition as
Ackerson and Fu did in [5]. There, they expressed $f(x_k \mid z^k)$
as a weighted sum of Gaussian p.d.f.'s, each density corresponding to
a particular realization of the switching sequence
$\Gamma_k \triangleq (\gamma_0,\ldots,\gamma_j,\ldots,\gamma_k)$ with
$\gamma_j$ = 0 or 1. They then used a bank of $2^{k+1}$ Kalman
filters to obtain the means and variances associated with each
sequence and also derived expressions for the weights,
$p(\Gamma_k \mid z^k)$. Though a similar decomposition can be used
for $f(x_k \mid z^{k-1},\gamma_k)$, such an approach is not practical
because the number of terms involved grows exponentially.

A closer inspection of the detection relations shows that the source
of difficulty lies in $f(x_k \mid z^{k-1},\gamma_k)$ being
non-Gaussian. In contrast, if it were Gaussian, then Eq. (14) would
imply that $f(z_k \mid z^{k-1},\gamma_k)$ is also Gaussian and we
need only compute the means and variances. In other words, we could
then use a Kalman filter to provide us with the needed parameters.
This simplification associated with Gaussian p.d.f.'s has been
exploited before and we shall utilize it in the sub-optimal
procedures to be discussed shortly.

Specifically, we shall first write $f(x_k \mid z^{k-1},\gamma_k)$ as

$$f(x_k \mid z^{k-1},\gamma_k) = \int_{R^n} dx_{k-1}\, f(x_k \mid x_{k-1})\, f(x_{k-1} \mid z^{k-1},\gamma_k), \tag{19}$$

and then proceed in 2 steps:

(i)  The functional form of $f(x_{k-1} \mid z^{k-1},\gamma_k)$ is
     approximated by $N(\cdot \mid \mu, V)$,

(ii) The values for $\mu$, V are NOT chosen as the actual mean and
     variance, $E\{x_{k-1} \mid z^{k-1},\gamma_k\}$ and
     $\mathrm{Var}\{x_{k-1} \mid z^{k-1},\gamma_k\}$, but rather as
     the estimate for $x_{k-1}$ given the measurement
     $z^{k-1} = (z_0,\ldots,z_{k-1})$ and the corresponding variance
     and, hence, will depend on the estimation method used.

We shall investigate 3 filtering procedures in the next section,
namely:

(A)  Decision-directed filtering,

(B)  Linear least-mean-squared error filtering,

(C)  Mean-squared error nonlinear filtering.

For each scheme, the filter equations will be derived, and the
corresponding algorithm for sequential detection will be described.
We then present, in Sec. V, the results of a Monte Carlo simulation
performed using the three detectors.
IV.  Derivation of the Filtering Equations and the Corresponding
Detection Procedures

A.  Decision-Directed Filtering

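This filtering scheme assumes that the decision we make about
$\gamma_k$ at the kth stage, $\hat\gamma_k$, is correct. In other
words, we make the assumption that

$$\Gamma_{k-1} \triangleq (\gamma_0,\ldots,\gamma_j,\ldots,\gamma_{k-1})^T
= (\hat\gamma_0,\ldots,\hat\gamma_j,\ldots,\hat\gamma_{k-1})^T \triangleq \hat\Gamma_{k-1}$$

and hence the p.d.f. $f(x_{k-1} \mid z^{k-1},\gamma_k)$ can be
approximated as follows:

$$\begin{aligned}
f(x_{k-1} \mid z^{k-1},\gamma_k) &\approx f(x_{k-1} \mid z^{k-1},\hat\Gamma_{k-1},\gamma_k) \\
&= f(x_{k-1} \mid z^{k-1},\hat\Gamma_{k-1}) \triangleq f^{(1)}(x_{k-1} \mid z^{k-1},\gamma_k).
\end{aligned} \tag{20}$$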


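Since $f(x_{k-1} \mid z^{k-1},\hat\Gamma_{k-1})$ is the p.d.f.
corresponding to a particular realization of $\Gamma_{k-1}$, it is in
fact a Gaussian density of the form
$N(\cdot \mid \hat x^{(1)}_{k-1|k-1}, V^{(1)}(k-1))$. Consequently,
the familiar Kalman filter may be used to obtain the mean and
variance as shown below:

$$\hat x^{(1)}_{k|k} = \Phi_{k,k-1}\hat x^{(1)}_{k-1|k-1} + K^{(1)}(k)\left(z_k - H_k\Phi_{k,k-1}\hat x^{(1)}_{k-1|k-1}\right) \tag{21}$$

$$V^{(1)}(k) = \left[I - K^{(1)}(k)H_k\right]V^{(1)}(k|k-1) \tag{22}$$

where

$$K^{(1)}(k) = V^{(1)}(k|k-1)H_k^T\left[H_kV^{(1)}(k|k-1)H_k^T + V_v(k) + \hat\gamma_kV_w(k)\right]^{-1} \tag{23}$$

$$V^{(1)}(k|k-1) = \Phi_{k,k-1}V^{(1)}(k-1)\Phi_{k,k-1}^T + G_{k-1}V_u(k-1)G_{k-1}^T, \qquad V^{(1)}(0) = V_0. \tag{24}$$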
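For a scalar system, Eqs. (21)-(24) reduce to a few arithmetic operations. The following sketch assumes time-invariant scalar parameters; the function and argument names are hypothetical and chosen only for this illustration.

```python
def dd_filter_step(x_prev, V_prev, z, gamma_hat, phi, g, h, Vu, Vv, Vw):
    """Scalar decision-directed Kalman step.

    gamma_hat is the detector's decision at stage k (0 or 1); it selects the
    measurement-noise variance Vv or Vv + Vw used in the gain.
    """
    x_pred = phi * x_prev
    V_pred = phi * V_prev * phi + g * Vu * g                 # Eq. (24)
    K = V_pred * h / (h * V_pred * h + Vv + gamma_hat * Vw)  # Eq. (23)
    x_new = x_pred + K * (z - h * x_pred)                    # Eq. (21)
    V_new = (1 - K * h) * V_pred                             # Eq. (22)
    return x_new, V_new


# Same measurement, two different decisions (illustrative numbers).
x_h1, V_h1 = dd_filter_step(0.0, 1.0, 3.0, 0, 1.0, 1.0, 1.0, 1.0, 1.0, 10.0)
x_h2, V_h2 = dd_filter_step(0.0, 1.0, 3.0, 1, 1.0, 1.0, 1.0, 1.0, 1.0, 10.0)
```

Deciding `gamma_hat = 1` inflates the assumed noise variance, so the filter trusts the measurement less: the state moves a shorter distance toward `z` and the posterior variance stays larger.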


The resulting algorithm for the sequential detection of $\gamma_k$
can now be stated as follows:

Algorithm 1

step 1  Start with

        $$f(x_0 \mid z^{-1},\gamma_0) = f(x_0) = N(x_0 \mid \mu_0, V_0)$$

        $$f(z_0 \mid z^{-1},\gamma_0) = f(z_0 \mid \gamma_0) = N(z_0 \mid H_0\mu_0,\ H_0V_0H_0^T + V_v(0) + \gamma_0V_w(0))$$

        $$L_0(z^0) = \frac{f(z_0 \mid \gamma_0=1)}{f(z_0 \mid \gamma_0=0)}$$

        $$\Pr\{\gamma_0=1\} = p_0 \neq \Pr\{\gamma_0=1 \mid z^0\}$$

step 2  Assume we are at stage k. Then use Eqs. (20), (19) and (14)
        to compute $f^{(1)}(x_{k-1} \mid z^{k-1},\gamma_k)$,
        $f^{(1)}(x_k \mid z^{k-1},\gamma_k)$ and
        $f^{(1)}(z_k \mid z^{k-1},\gamma_k)$, respectively. Also,
        determine $p(\gamma_k)$ from Eq. (13). (By the Gaussian
        assumption, the conditional densities above are Gaussian and
        hence are completely specified by their means and variances.)

step 3  Compute $L^{(1)}_k(z^k)$ as well as decide on $\gamma_k$
        using the test (10-a).

step 4  Compute $\hat x^{(1)}_{k|k}$ and $V^{(1)}(k)$ from Eqs.
        (21)-(24), and store for later use in the next stage.

step 5  Set k = k+1 and go to step 2.

We  note  that  an  important  difference  between  the  above  algorithm 
and  Algorithm  0 is  in  step  4,  where  the  detector  output  at  stage 
k determines  the  estimator  structure  at  the  same  stage. 


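B.  Linear Least-Mean-Squared Error Filtering

Here we use least mean squares theory, together with a linear
constraint, to compute the mean and variance appearing in the
Gaussian approximation of $f(x_{k-1} \mid z^{k-1},\gamma_k)$. Thus:

$$f(x_{k-1} \mid z^{k-1},\gamma_k) \approx N\!\left(x_{k-1} \mid \hat x^{(2)}_{k-1|k-1},\ V^{(2)}(k-1)\right) \triangleq f^{(2)}(x_{k-1} \mid z^{k-1},\gamma_k), \tag{25}$$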


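where $\hat x^{(2)}_{k|k}$ satisfies the relation

$$\hat x^{(2)}_{k|k} = F_1(k)\,\hat x^{(2)}_{k-1|k-1} + F_2(k)\,z_k \tag{26}$$

and $F_1$, $F_2$ are chosen to minimize
$E_{x,\gamma}\{(x_k - \hat x^{(2)}_{k|k})^T Q\, (x_k - \hat x^{(2)}_{k|k})\}$.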

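We will show that $F_1$ and $F_2$ are exactly those matrices
appearing in the Kalman filter with the appropriate modification. To
do so, we rewrite the system equations as follows:

$$x_k = \Phi_{k,k-1}\,x_{k-1} + G_{k-1}\,u_{k-1} \tag{1}$$

$$z_k = H_kx_k + \eta_k \tag{2-a}$$

where $\eta_k \triangleq v_k + \gamma_kw_k$ and the underlying
statistics are given by:

$$x_0 \sim N(\cdot \mid \mu_0, V_0) \tag{3}$$

$$u_k \sim N(\cdot \mid 0, V_u(k)), \qquad E\{u_ju_k^T\} = V_u(k)\,\delta_{jk} \tag{4}$$

$$\eta_k \sim \Pr\{\gamma_k=0\}\cdot N(\cdot \mid 0, V_v(k)) + \Pr\{\gamma_k=1\}\cdot N(\cdot \mid 0, V_v(k)+V_w(k)), \qquad E\{\eta_j\eta_k^T\} = V_\eta(k)\,\delta_{jk}. \tag{5-a}$$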

Clearly, the above system is in the framework of the well-known
Kalman filter [11], and therefore has the following solution:

$$\hat x^{(2)}_{k|k} = \Phi_{k,k-1}\hat x^{(2)}_{k-1|k-1} + K^{(2)}(k)\left[z_k - H_k\Phi_{k,k-1}\hat x^{(2)}_{k-1|k-1}\right] \tag{27}$$

$$V^{(2)}(k) = \left[I - K^{(2)}(k)H_k\right]V^{(2)}(k|k-1) \tag{28}$$

where

$$K^{(2)}(k) = V^{(2)}(k|k-1)H_k^T\left[H_kV^{(2)}(k|k-1)H_k^T + V_v(k) + \Pr\{\gamma_k=1\}V_w(k)\right]^{-1} \tag{29}$$

$$V^{(2)}(k|k-1) = \Phi_{k,k-1}V^{(2)}(k-1)\Phi_{k,k-1}^T + G_{k-1}V_u(k-1)G_{k-1}^T, \qquad V^{(2)}(0) = V_0. \tag{30}$$
Observe that the structure of the linear LMSE estimator for $x_k$ is
independent of the higher-order statistics for $\{\gamma_k\}$. These
statistics, however, enter our detection procedure through the
expression of the likelihood ratio, as given by Eq. (10-a).

We may now describe the sequential scheme for the detection of
$\gamma_k$ - using the Gaussian approximation and linear LMSE filter
- by the following algorithm:


Algorithm 2

step 1  Start with

        $$f(x_0 \mid z^{-1},\gamma_0) = f(x_0) = N(x_0 \mid \mu_0, V_0)$$

        $$f(z_0 \mid z^{-1},\gamma_0) = f(z_0 \mid \gamma_0) = N(z_0 \mid H_0\mu_0,\ H_0V_0H_0^T + V_v(0) + \gamma_0V_w(0))$$

        $$L_0(z^0) = \frac{f(z_0 \mid \gamma_0=1)}{f(z_0 \mid \gamma_0=0)}$$

step 2  Assume we are at stage k. Then use Eqs. (25), (19) and (14)
        to compute $f^{(2)}(x_{k-1} \mid z^{k-1},\gamma_k)$,
        $f^{(2)}(x_k \mid z^{k-1},\gamma_k)$ and
        $f^{(2)}(z_k \mid z^{k-1},\gamma_k)$, respectively. Also
        determine $p(\gamma_k)$ from Eq. (13). As before, only the
        means and variances need be evaluated for the Gaussian
        p.d.f.'s.

step 3  Compute $L^{(2)}_k(z^k)$ as well as decide on $\gamma_k$
        using the test (10-a).

step 4  Obtain $\hat x^{(2)}_{k|k}$ and $V^{(2)}(k)$ from Eqs.
        (27)-(30) and store for later use in the next stage.

step 5  Set k = k+1 and go to step 2.

In contrast to the D-D scheme, the estimator structure is
independent of the decision we make about $\gamma_k$ and, at the same
time, depends only on the first-order statistics of $\gamma_k$. As a
consequence of the first fact, we expect the mean squared estimation
error to be less for the linear LMSE scheme than for the D-D scheme.
Further, the second fact suggests that, by incorporating the
higher-order statistics of $\gamma_k$ into our estimator, we may be
able to obtain an even better estimate. This is the case for the
scheme to follow.


C.  Mean-Squared Error Nonlinear Filtering


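One may visualize the preceding scheme as one in which the
parameters required in the Gaussian approximation of
$f(x_k \mid z^k,\gamma_{k+1})$ are obtained as the solution of the
following problem:

$$\min_{\hat x^{(2)}_{k|k}}\ E\{(x_k - \hat x^{(2)}_{k|k})^T Q\, (x_k - \hat x^{(2)}_{k|k})\}$$

subject to

$$\begin{aligned}
&\hat x^{(2)}_{k|k} \text{ is a linear function in } z_k, \\
&z_k = H_kx_k + \eta_k, \\
&\text{prior of } x_k \sim N\!\left(\cdot \mid \Phi_{k,k-1}\hat x^{(2)}_{k-1|k-1},\ V^{(2)}(k|k-1)\right), \\
&\text{prior of } \eta_k = \Pr\{\gamma_k=0 \mid z^{k-1}\}\cdot N(\cdot \mid 0, V_v(k))
   + \Pr\{\gamma_k=1 \mid z^{k-1}\}\cdot N(\cdot \mid 0, V_v(k)+V_w(k)), \\
&x_k, \eta_k \text{ are independent, given } z^{k-1} = \{z_0,\ldots,z_{k-1}\}.
\end{aligned}$$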


It is logical, therefore, that one way to obtain a better estimate
for $x_k$ is to relax the restriction that $\hat x^{(2)}_{k|k}$ be in
the class of linear functions in $z_k$. Hence by letting the estimate
of $x_k$ range over all possible functions of $z_k$ we get an
improved estimator which we shall denote by $\hat x^{(3)}_{k|k}$, and
the corresponding variance will be denoted by $V^{(3)}(k)$. The name
Approximate Non-Gaussian filter, which we give to this estimator,
originates from the fact that we make the incorrect assumption of a
Gaussian prior for $x_k$ which, therefore, results in approximate
values for $E\{x_k \mid z^k\}$ and $\mathrm{Var}\{x_k \mid z^k\}$.
(As before, these parameters are then used in the approximation:
$f(x_k \mid z^k,\gamma_{k+1}) \approx N(x_k \mid \hat x^{(3)}_{k|k}, V^{(3)}(k))$.)
The derivation of the filter uses a result due to Masreliez [12] and
we give both this result and the derivation in the Appendix. The
resulting estimator for the state is then described by the following
equations:

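$$\hat x^{(3)}_{k|k} = \Phi_{k,k-1}\hat x^{(3)}_{k-1|k-1} + V^{(3)}(k|k-1)H_k^T\,g(z_k) \tag{31}$$

$$V^{(3)}(k) = \left[I - V^{(3)}(k|k-1)H_k^T\,G(z_k)\,H_k\right]V^{(3)}(k|k-1) \tag{32}$$

where

$$V^{(3)}(k|k-1) = \Phi_{k,k-1}V^{(3)}(k-1)\Phi_{k,k-1}^T + G_{k-1}V_u(k-1)G_{k-1}^T, \qquad V^{(3)}(0) = V_0 \tag{33}$$

$$g(z_k) = (1-q)\,V_1^{-1}\left(z_k - H_k\Phi_{k,k-1}\hat x^{(3)}_{k-1|k-1}\right) + q\,V_2^{-1}\left(z_k - H_k\Phi_{k,k-1}\hat x^{(3)}_{k-1|k-1}\right) \tag{34}$$

$$\begin{aligned}
G(z_k) &= (1-q)\,V_1^{-1} + q\,V_2^{-1} \\
&\quad - (1-q)\,q\left[(V_2^{-1}-V_1^{-1})\left(z_k - H_k\Phi_{k,k-1}\hat x^{(3)}_{k-1|k-1}\right)\left(z_k - H_k\Phi_{k,k-1}\hat x^{(3)}_{k-1|k-1}\right)^T(V_2^{-1}-V_1^{-1})\right].
\end{aligned} \tag{35}$$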


In the above we used the substitution:

$$q \triangleq \Pr\{\gamma_k = 1 \mid z^k\}$$

$$V_1 = H_kV^{(3)}(k|k-1)H_k^T + V_v(k)$$

$$V_2 = H_kV^{(3)}(k|k-1)H_k^T + V_v(k) + V_w(k).$$
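A scalar version of Eqs. (31)-(35) can be sketched as follows. The function names and arguments are hypothetical, the parameters are taken as time-invariant scalars, and `p1_pred` stands for $\Pr\{\gamma_k=1 \mid z^{k-1}\}$, which is assumed to be supplied by Eq. (18). The posterior weight `q` is computed directly from the two Gaussian residual densities via Bayes' rule.

```python
import math


def gauss(r, var):
    """Zero-mean scalar Gaussian density evaluated at r."""
    return math.exp(-0.5 * r * r / var) / math.sqrt(2.0 * math.pi * var)


def ang_filter_step(x_prev, V_prev, z, p1_pred, phi, g_in, h, Vu, Vv, Vw):
    """Scalar Approximate Non-Gaussian filter step; returns (x_new, V_new, q)."""
    x_pred = phi * x_prev
    V_pred = phi * V_prev * phi + g_in * Vu * g_in   # Eq. (33)
    r = z - h * x_pred                               # residual
    V1 = h * V_pred * h + Vv                         # residual variance, gamma_k = 0
    V2 = V1 + Vw                                     # residual variance, gamma_k = 1
    n0 = (1.0 - p1_pred) * gauss(r, V1)
    n1 = p1_pred * gauss(r, V2)
    q = n1 / (n0 + n1)                               # Pr{gamma_k = 1 | z^k}
    score = ((1.0 - q) / V1 + q / V2) * r            # g(z_k), Eq. (34)
    d = 1.0 / V2 - 1.0 / V1
    G = (1.0 - q) / V1 + q / V2 - (1.0 - q) * q * d * r * r * d   # Eq. (35)
    x_new = x_pred + V_pred * h * score              # Eq. (31)
    V_new = (1.0 - V_pred * h * G * h) * V_pred      # Eq. (32)
    return x_new, V_new, q


# Degenerate priors recover ordinary Kalman updates (illustrative numbers).
x_a, V_a, q_a = ang_filter_step(0.0, 1.0, 3.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 10.0)
x_b, V_b, q_b = ang_filter_step(0.0, 1.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 10.0)
x_c, V_c, q_c = ang_filter_step(0.0, 1.0, 3.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 10.0)
```

With `p1_pred` equal to 0 or 1 the mixture collapses to a single Gaussian and the step coincides with a standard Kalman update; for intermediate values the correction is a `q`-weighted blend whose weights depend on the measurement, which is what makes the estimator nonlinear.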

Observe that the equations specify an estimator which is nonlinear.
This is a consequence of the observation noise being a Gaussian
mixture rather than a pure Gaussian p.d.f. Also, notice that the
filter structure now incorporates the higher-order statistics of
$\{\gamma_k\}$ through q, and hence we expect it to perform better
than the previous schemes. Based on the above filter, we get the
following approximation for $f(x_{k-1} \mid z^{k-1},\gamma_k)$:

$$f(x_{k-1} \mid z^{k-1},\gamma_k) \approx N\!\left(x_{k-1} \mid \hat x^{(3)}_{k-1|k-1},\ V^{(3)}(k-1)\right) \triangleq f^{(3)}(x_{k-1} \mid z^{k-1},\gamma_k), \tag{36}$$

and the detection algorithm becomes:


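Algorithm 3

step 1  Start with

        $$f(x_0 \mid z^{-1},\gamma_0) = f(x_0) = N(x_0 \mid \mu_0, V_0)$$

        $$f(z_0 \mid z^{-1},\gamma_0) = f(z_0 \mid \gamma_0) = N(z_0 \mid H_0\mu_0,\ H_0V_0H_0^T + V_v(0) + \gamma_0V_w(0))$$

        $$L_0(z^0) = \frac{f(z_0 \mid \gamma_0=1)}{f(z_0 \mid \gamma_0=0)}$$

        $$\Pr\{\gamma_0=1\} = p_0 \neq \Pr\{\gamma_0=1 \mid z^0\}$$

step 2  Assume we are at stage k. Then use Eqs. (36), (19) and (14)
        to compute $f^{(3)}(x_{k-1} \mid z^{k-1},\gamma_k)$,
        $f^{(3)}(x_k \mid z^{k-1},\gamma_k)$ and
        $f^{(3)}(z_k \mid z^{k-1},\gamma_k)$, respectively. Also,
        determine $p(\gamma_k)$ from Eq. (13).

step 3  Compute $L^{(3)}_k(z^k)$ as well as decide on $\gamma_k$
        using the test (10-a).

step 4  Obtain q from Eqs. (18) and (17), and notice that the latter
        reduces to

        $$1 - q \triangleq \Pr\{\gamma_k=0 \mid z^k\}
        = \frac{1}{1 + \dfrac{\Pr\{\gamma_k=1 \mid z^{k-1}\}}{\Pr\{\gamma_k=0 \mid z^{k-1}\}}
          \left(\dfrac{|V_1|}{|V_2|}\right)^{1/2}
          e^{-\frac{1}{2}(\text{residue})^T (V_2^{-1}-V_1^{-1})\,(\text{residue})}}$$

        where the residue at the kth stage
        $= z_k - H_k\Phi_{k,k-1}\hat x^{(3)}_{k-1|k-1}$.

        Next compute $\hat x^{(3)}_{k|k}$ and $V^{(3)}(k)$ from Eqs.
        (31)-(35), and store for later use in the next stage.

step 5  Set k = k+1 and go to step 2.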


Algorithm 3 as well as the previous two algorithms are much easier
to implement in comparison with the optimal detector described by
Algorithm 0. To evaluate their performance, a computer simulation
was performed, the results of which are presented in the following
section.


V.  Simulation and Results


A simulation study of the three suboptimal detection procedures was
performed. The system model used is described by:

        x_k = [coefficient illegible in the source] x_{k-1} + u_{k-1}

        $$z_k = x_k + v_k + \gamma_kw_k$$

where all the vectors are one-dimensional and have the following
statistics:

        $$x_0 \sim N(\cdot \mid 1.0,\ 0.)$$

        $$u_k \text{ and } v_k \text{ are } N(\cdot \mid 0.,\ 1.)$$

        $$w_k \sim N(\cdot \mid 0.,\ V_w)$$

        $\gamma_k \in \{0,1\}$ with transition probability matrix P.

Here $V_w$ and $\alpha$ are parameters which we varied in order to
get different density functions for the measurement noise.

In order to simulate the derived algorithms, there were three major
tasks to carry out. The first was concerned with generating the
measurement sequence $\{z_k\}$, which reduced to that of obtaining
the random variables involved. The Gaussian random variables were
generated using the RANORM subroutine of the IBM-360 Subroutine
Library, while the Markov chain, $\{\gamma_k\}$, was generated by
making use of a random number generator together with the transition
probability matrix (as described in [13]).
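The data-generation task can be sketched with Python's standard `random` module in place of the RANORM subroutine. Everything specific here is an assumption for illustration: the function name, the dynamics coefficient `phi` (whose value is not legible in the source), the initial probability `p0`, and the symmetric chain in which `alpha` is the probability of switching state at each step.

```python
import math
import random


def simulate(n_steps, phi, Vw, alpha, p0, x0=1.0, seed=0):
    """Generate (x_k, gamma_k, z_k) triples for the scalar switching model.

    phi   : hypothetical state-transition coefficient
    alpha : assumed probability that gamma switches between stages
    p0    : assumed initial probability Pr{gamma_0 = 1}
    """
    rng = random.Random(seed)
    gamma = 1 if rng.random() < p0 else 0
    x = x0
    samples = []
    for k in range(n_steps):
        if k > 0:
            if rng.random() < alpha:           # Markov switching step
                gamma = 1 - gamma
            x = phi * x + rng.gauss(0.0, 1.0)  # unit-variance process noise u
        v = rng.gauss(0.0, 1.0)                # unit-variance measurement noise v
        w = rng.gauss(0.0, math.sqrt(Vw))      # extra noise, present when gamma = 1
        samples.append((x, gamma, x + v + gamma * w))
    return samples


always_switch = [g for _, g, _ in simulate(10, 0.8, 10.0, 1.0, 0.5)]
never_switch = [g for _, g, _ in simulate(10, 0.8, 10.0, 0.0, 0.5)]
```

Setting `alpha = 0.5` makes successive values of `gamma` independent, while values near 1 produce a sequence that alternates almost every step.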

The next task was that of implementing the detection schemes. Here,
a Kalman filter was used with appropriate modifications for each of
the three filtering procedures.

The last issue to be resolved was that of evaluating the mean
squared error in estimation and the probability of error in
detection. These two quantities were obtained using Monte Carlo
methods which provided, at the same time, information on the
probability distribution of estimation error at different time
instants. Specifically we used the following formulas:
Remark: We used a value of N = 3000 as it proved to give
sufficiently smooth curves without requiring excessive computer time.
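Monte Carlo estimates of this kind are typically plain sample averages over the N independent runs. The sketch below shows one such estimator under that assumption; it is not a transcription of the report's formulas, and the function name and the per-run record layout `(x, x_hat, gamma, gamma_hat)` are hypothetical.

```python
def monte_carlo_averages(runs):
    """Per-stage sample MSE and error frequency.

    runs : list of N runs; each run is a list over stages k of tuples
           (x, x_hat, gamma, gamma_hat).
    """
    n_runs = len(runs)
    n_stages = len(runs[0])
    mse = [sum((run[k][0] - run[k][1]) ** 2 for run in runs) / n_runs
           for k in range(n_stages)]
    p_err = [sum(1 for run in runs if run[k][2] != run[k][3]) / n_runs
             for k in range(n_stages)]
    return mse, p_err


# Two toy runs with two stages each (made-up numbers).
toy_runs = [
    [(1.0, 0.0, 0, 0), (2.0, 2.0, 1, 0)],
    [(1.0, 2.0, 1, 1), (0.0, 2.0, 1, 1)],
]
toy_mse, toy_perr = monte_carlo_averages(toy_runs)
```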

The simulation study proceeded in two main directions. The first
was to compare the performance of the three detection schemes for a
specific system and under identical noise statistics. For this
purpose two performance criteria were used: the mean-squared error
in estimation and the probability of error in detection. The second
direction was to evaluate the performance of each scheme
individually for various measurement noise distributions, in other
words, to evaluate its sensitivity.

Results 

Fig. 1(a) shows the mean-squared error (MSE) in estimation for the
Decision-Directed (DD), the Linear Least-Mean Squared Error (LLMSE)
and the Approximate Non-Gaussian (ANG) filters, plotted against
time. The variance of the noise sequence $\{w_k\}$ was chosen equal
to the constant value 10 and the Markov chain (MC) had the
transition probability matrix
$P = \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{bmatrix}$, which
corresponds to a switching sequence of i.i.d. r.v.'s. We observe
that the (ANG) filter performs uniformly better than the (LLMSE)
filter and the latter is, in turn, uniformly better than the (DD)
filter. More specifically, at k=21 the three filters have,
respectively, a MSE of 1.32, 1.47 and 1.57. In Fig. 1(b), we plotted
the performance of the three detectors, as measured by the
probability of error. Here we observe their performance to be
surprisingly similar to one another, despite their differences in
state estimation. Thus, the value of the error probability at k=21
is approximately 35% for all three schemes.
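A short worked step may help show why a transition matrix with all entries equal to 0.5 gives an i.i.d. switching sequence and what that does to the detector. It is a sketch that uses the notation $P(i|j) = p(\gamma_k = i \mid \gamma_{k-1} = j)$:

```latex
\begin{align*}
P(i|j) &= 0.5 \quad \text{for all } i, j \in \{0,1\}
  &&\text{(the next state does not depend on the current one)} \\
p(\gamma_k = i) &= 0.5\,\Pr\{\gamma_{k-1}=0\} + 0.5\,\Pr\{\gamma_{k-1}=1\} = 0.5
  &&\text{(by Eq. (13))} \\
\frac{P(1|1)}{P(1|0)} &= \frac{P(0|1)}{P(0|0)} = 1
  &&\text{(the terms containing } L_{k-1} \text{ cancel)} \\
L_k(z^k) &= \frac{f(z_k \mid z^{k-1}, \gamma_k = 1)}{f(z_k \mid z^{k-1}, \gamma_k = 0)}
  \ \underset{H_1}{\overset{H_2}{\gtrless}}\ 1 .
\end{align*}
```

In this case the decision at stage k rests only on the two one-step predictive densities, so the detectors differ only through the quality of the state estimate that enters those densities.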
Figs. 2(a,b) show the performance in estimation and detection for
the three schemes with a different choice of the transition
probability matrix P (its entries are illegible in the source).
Obviously our switching sequence in this case is no longer i.i.d.
but rather a highly dependent one. Nevertheless, we obtain a
performance similar to the previous case, with a MSE value of 1.35
for the (ANG) filter at k=21 and a probability of error of 44% at
the same instant. Thus, the dependencies in $\{\gamma_k\}$ seem to
bring about an increase in the MSE as well as in the probability of
error. They also resulted in oscillatory transients that lasted for
10 time steps in the MSE curve and for 16 time steps in the
probability of error curve.

We next examined the effect of $V_w$ on the performance by changing
$V_w$ to 1.0 and keeping P at its i.i.d. value (all entries 0.5).
The results obtained are depicted in Figs. 3(a,b). We notice that
the relative performance of the three schemes is similar to the case
of $V_w$ = 10.0, and that the effect of reducing $V_w$ was to reduce
the MSE to .73 for the (ANG) filter and to increase the probability
of error (to 45% in all three detectors). Furthermore, there is no
appreciable difference in the MSE between the (ANG) and the (LLMSE)
filters, while the (DD) filter has a MSE which is higher by only
.03.

We now turn to the sensitivity analysis for each scheme w.r.t. the
parameters $\alpha$ and $V_w$. The results are shown in Figs. 4, 5
and 6, which give both the MSE and the probability of error for the
(DD), the (LLMSE) and the (ANG) procedures, respectively. Assuming a
symmetric transition probability matrix,
$P = \begin{bmatrix} 1-\alpha & \alpha \\ \alpha & 1-\alpha \end{bmatrix}$,
we increased $\alpha$ from 0.5 to 0.9 and then to .99. The effect
was as follows:


a)  for the (DD) scheme, the MSE increased with the increase in
    $\alpha$. Moreover, oscillatory transients were observed for
    $\alpha$ = .9 and these oscillations persisted for $\alpha$ =
    .99. The probability of error also increased as $\alpha$
    increased, with sustained oscillations for $\alpha$ = .99.

b)  for the (LLMSE) scheme, we have a similar dependency on $\alpha$
    as in the previous scheme. However, the MSE for $\alpha$ = .99
    oscillates about an average value which is approximately the MSE
    for $\alpha$ = .9 (this bias was larger for the (DD) scheme),
    and these oscillations have smaller amplitude. Further, the
    probability of error did increase as $\alpha$ increased, with
    that of $\alpha$ = .99 dominating the probability of error for
    $\alpha$ = .9 (this is different from the (DD) case).

c)  for the (ANG) scheme, the general features are similar to the
    previous two schemes, but with the following distinctions: the
    MSE for $\alpha$ = .99 oscillates about an average value which
    is less than the MSE at $\alpha$ = .9, and both the oscillations
    in the MSE and the probability of error have amplitudes that are
    larger than those of the (LLMSE) detection scheme.

Finally, we investigated the effect of $V_w$ by reducing its value
from 10.0 to 1.0, while keeping $\alpha$ fixed at 0.5. The resulting
MSEs and probabilities of error are shown in Figs. 4-6, in which we
observe a common property: as $V_w$ decreased, the MSE also
decreased and the probability of error in detecting $\gamma_k$
increased. We will give an explanation for this behavior as well as
other observations in the next section.


t 
We shall attempt to explain some of the observations made
in the previous section, starting with the relative performance
of the three procedures. The results showed that the (ANG) estimator
outperformed the (LLMSE) estimator, which in turn had less MSE
than the (DD) scheme. This comes as no surprise, especially in
view of our introduction to the (ANG) filter in Section IV. It
was mentioned there that the (ANG) filter is optimal in the MSE sense,
provided the p.d.f. $f(x_k \mid z^{k-1})$ is Gaussian, an assumption that
seems to hold as indicated by Figs. 7(a,b). On the other hand,
the (DD) estimator assumes each decision we make about $\gamma_k$ to be
correct and determines the Kalman filter gain accordingly. Since
any detector has a nonzero probability of error, and since in the
(DD) scheme there is an interaction between the estimator
and the detector, we expect incorrect decisions to propagate,
resulting in a degradation of the filter performance.
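
The way a wrong decision feeds back into the (DD) estimator can be illustrated with a minimal scalar sketch. This is not the code used in the simulations: the function name `dd_update`, the scalar model, and the rule of deciding $\gamma_k = 1$ when its posterior probability exceeds 1/2 are assumptions made only for illustration.

```python
import math


def gauss(x, mean, var):
    """Scalar Gaussian density N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def dd_update(z, x_pred, m_pred, h, v_v, v_w, p):
    """One decision-directed measurement update (scalar, illustrative).

    z        : measurement z_k
    x_pred   : predicted state mean
    m_pred   : predicted state variance
    h        : observation gain H_k
    v_v, v_w : variances of v_k and w_k
    p        : prior probability Pr{gamma_k = 1 | z^{k-1}}
    """
    mu = h * x_pred
    v1 = h * m_pred * h + v_v        # innovation variance if gamma_k = 0
    v2 = v1 + v_w                    # innovation variance if gamma_k = 1
    like0 = (1.0 - p) * gauss(z, mu, v1)
    like1 = p * gauss(z, mu, v2)
    q = like1 / (like0 + like1)      # posterior Pr{gamma_k = 1 | z^k}
    gamma_hat = 1 if q > 0.5 else 0  # hard decision (assumed threshold rule)

    # The decision is treated as correct: it alone selects the noise variance
    # used in the Kalman gain, so a wrong decision produces a wrong gain.
    s = v2 if gamma_hat == 1 else v1
    gain = m_pred * h / s
    x_hat = x_pred + gain * (z - mu)
    p_post = (1.0 - gain * h) * m_pred
    return x_hat, p_post, gamma_hat, q
```

Because `x_hat` and `p_post` are computed from the hard decision alone, an incorrect `gamma_hat` biases the next prediction, which in turn affects the next decision. This is the interaction between estimator and detector described above.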

The second observation is that, despite the discrepancy
between the MSE's of the three schemes, the probability of error is
nevertheless practically the same. This suggests that the use
of an MSE criterion for the estimator may not give the best overall
detector, and it appears to be consistent with previously reported
results [14].

Next we explain the effect of changing a. At a = .5, the
switching sequence $\{\gamma_k\}$ is i.i.d., and hence the prior probability
for $\gamma_k$ is independent of the measurements $\{z_0, \ldots, z_{k-1}\}$ for
each k. This eliminates one source of error, namely the estimation
of $p(\gamma_k \mid z^{k-1})$. Another consequence of independence is that
the state $x_k$ and the measurement noise $\eta_k = v_k + \gamma_k w_k$ become
conditionally independent, a requirement needed for Masreliez's
theorem to hold (see the derivation in the Appendix and Ref.
[12]). Due to these two factors, we expect less MSE for a = 0.5
than for a = 0.9 or 0.99. Though this seems to hold for the (DD)
and the (ANG) filters, the (LLMSE) filter departs from this
conclusion. For a = 0.5, we also found the probability of error to
be the lowest for all three schemes. This can be explained in
terms of the higher accuracy in $\hat{x}_{k|k}$ and the more accurate values
for $p(\gamma_k \mid z^{k-1})$.
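
The independence of the prior from past measurements at a = 0.5 can be checked with a short sketch. It assumes, as an illustration only, that the switching sequence is a symmetric two-state Markov chain in which a is the probability that $\gamma_k$ equals $\gamma_{k-1}$; the function name `predict_prior` is hypothetical.

```python
def predict_prior(a, q_prev):
    """Prior Pr{gamma_k = 1 | z^{k-1}} from the previous posterior.

    a      : assumed probability that gamma_k equals gamma_{k-1}
    q_prev : posterior Pr{gamma_{k-1} = 1 | z^{k-1}}
    """
    return a * q_prev + (1.0 - a) * (1.0 - q_prev)


if __name__ == "__main__":
    for q_prev in (0.05, 0.5, 0.95):
        # At a = 0.5 the prior does not depend on q_prev; at a = 0.9 or
        # 0.99 any error in q_prev is carried over into the prior.
        print(q_prev, [predict_prior(a, q_prev) for a in (0.5, 0.9, 0.99)])
```

Under this assumption the prior equals 1/2 for every value of the previous posterior when a = 0.5, so errors in the previous posterior cannot enter the next decision through the prior.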

Finally, we observed that by decreasing $V_w$, the MSE decreased
while the probability of error increased. This is expected since
in general $\hat{x}_{k|k}$ is a function of $\{z_0, \ldots, z_k\}$, and hence reducing
the uncertainty in the measurement noise reduces the uncertainty
in $\hat{x}_{k|k}$ and therefore its variance. On the other hand, we have
$z_k = H_k x_k + v_k + \gamma_k w_k$, and as the variance of $w_k$ decreases so does
the effective S/N for the detection of $\gamma_k$. Consequently, the
probability of error increases.
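
The loss of detectability as the variance of $w_k$ shrinks can be illustrated numerically. The sketch below is a simplification, not the detector of this report: it assumes a single scalar zero-mean sample whose variance is `v1` under $\gamma_k = 0$ and `v1 + v_w` under $\gamma_k = 1$, and computes the minimum probability of error of the corresponding likelihood-ratio test. The function name and the value `v1 = 1.0` are hypothetical.

```python
import math


def detection_error(v1, v_w, p=0.5):
    """Minimum error probability for deciding gamma from one scalar sample.

    Under gamma = 0 the sample is N(0, v1); under gamma = 1 it is
    N(0, v1 + v_w). p is the prior probability of gamma = 1.
    """
    v2 = v1 + v_w
    # Likelihood-ratio test: decide gamma = 1 when r^2 exceeds t2.
    t2 = (math.log(v2 / v1) + 2.0 * math.log((1.0 - p) / p)) / (1.0 / v1 - 1.0 / v2)
    t = math.sqrt(max(t2, 0.0))
    false_alarm = math.erfc(t / math.sqrt(2.0 * v1))  # Pr{|r| > t | gamma = 0}
    miss = math.erf(t / math.sqrt(2.0 * v2))          # Pr{|r| < t | gamma = 1}
    return (1.0 - p) * false_alarm + p * miss


if __name__ == "__main__":
    for v_w in (10.0, 1.0):
        print(v_w, detection_error(1.0, v_w))
```

With these illustrative numbers the error probability is larger for `v_w = 1.0` than for `v_w = 10.0`, because the two hypotheses become harder to tell apart as their variances approach each other.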
VII. Summary and Conclusions

The problem of sequential detection in a switching environment
has been investigated. Using the Markov property of the
switching sequence, and applying Bayes' rule, we derived a
recursive structure for the Bayesian optimal detector. The
detector obtained gives a rule for deciding on the true hypothesis
in a situation where the underlying hypothesis switches from one
stage to another, according to a state transition probability matrix.

Since the actual implementation of the optimal detector
requires numerical integration of p.d.f.'s, there is an obvious need
for a more practical procedure. We undertook this task by
approximating the prior p.d.f. for the state with a Gaussian density whose
mean and variance are computed recursively. The three suboptimal
schemes that we proposed, namely the decision-directed procedure,
the linear LMSE procedure and the approximate non-Gaussian procedure,
were shown to be much simpler and easier to implement than their
optimal counterpart. Moreover, the simulation study showed the
decision-directed approach to be the least satisfactory, while the
approximate non-Gaussian procedure was the most accurate.

An immediate application of our results is in "target tracking
in a multi-target environment." Thus, if a sensor is tracking two
targets with the same state equations but different observation
models, then our results provide a procedure for sequentially
distinguishing the returns of one target from the other. These
assumptions can easily be extended to more complicated situations. For
instance, if the two targets have different state equations, thus
making the model more general, we can readily modify our results
by using the appropriate p.d.f.'s. Further, when there are more
than two targets, the derivation can be further modified by letting
$\gamma_k$ assume values in the set $\{0, 1, \ldots, K-1\}$, with K the number of
targets under consideration. In this case, the detection problem
becomes a multiple hypothesis problem with slightly more
complicated equations. Finally, since in practice we may be interested
in optimizing both the decision and the state estimate, we may do so
by assigning costs to each aspect of the problem when initially
formulating it. Such an approach was first proposed by Middleton
and Esposito [15], and is expected to yield a detector-estimator
structure which is a compromise between the optimal estimator of
Ackerson and Fu and our optimal detector.
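
The extension to K hypotheses mentioned above can be sketched for the scalar case. This is an illustration under stated assumptions, not a derivation from this report: each hypothesis is assumed to change only the variance of the innovation, the decision is taken as the hypothesis with the largest posterior probability, and the names `multi_hypothesis_step`, `variances` and `trans` are hypothetical.

```python
import math


def gauss(x, mean, var):
    """Scalar Gaussian density N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def multi_hypothesis_step(z, mu, variances, q_prev, trans):
    """One detection step for gamma_k in {0, ..., K-1} (scalar, illustrative).

    z         : measurement z_k
    mu        : predicted measurement mean
    variances : assumed innovation variance under each hypothesis
    q_prev    : posterior probabilities of gamma_{k-1} given z^{k-1}
    trans     : trans[i][j] = Pr{gamma_k = j | gamma_{k-1} = i}
    """
    k_hyp = len(variances)
    # Prior of gamma_k from the Markov transition matrix.
    prior = [sum(q_prev[i] * trans[i][j] for i in range(k_hyp))
             for j in range(k_hyp)]
    # Bayes' rule with the new measurement.
    joint = [prior[j] * gauss(z, mu, variances[j]) for j in range(k_hyp)]
    total = sum(joint)
    post = [w / total for w in joint]
    decision = max(range(k_hyp), key=lambda j: post[j])
    return post, decision
```

For K = 2 the same two steps, a prior computed from the transition matrix followed by Bayes' rule, give the binary posterior of the two-target case.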
In the following, we shall state a theorem due to Masreliez
[12], which is the basis for the derivation of the approximate
non-Gaussian filter. We will then compare the conditions of the
theorem with the assumptions for our problem and obtain the
resulting filter equations.

Masreliez Theorem

Let

$$ z_k = H_k x_k + \eta_k , $$

where

$$ f_x(x_k \mid z^{k-1}) = N(x_k \mid \bar{x}_k, M_k) , $$

$f(\eta_k \mid z^{k-1})$ is an arbitrary p.d.f.,

$x_k$ and $\eta_k$ are conditionally independent, and

$$ f_z(z_k \mid z^{k-1}) \triangleq \int f_\eta(z_k - H_k x_k \mid z^{k-1}) \, f_x(x_k \mid z^{k-1}) \, dx_k $$

is twice differentiable.

Then,

$$ \hat{x}_k \triangleq E\{x_k \mid z^k\} = \bar{x}_k + M_k H_k^T g(z_k) , \tag{A-1} $$

where

$$ g(z_k) = - \frac{\partial f_z(z_k \mid z^{k-1}) / \partial z_k}{f_z(z_k \mid z^{k-1})} , $$

and

$$ P_k \triangleq E\{ (x_k - \hat{x}_k)(x_k - \hat{x}_k)^T \mid z^k \} = M_k - M_k H_k^T G(z_k) H_k M_k , \tag{A-2} $$

where

$$ G(z_k) = \frac{\partial g(z_k)}{\partial z_k} . $$

We now consider the assumptions and conclusions of the
above theorem and see how they apply to our problem:

1 - Masreliez's theorem gives an expression for the conditional
mean estimate, $E\{x_k \mid z^k\}$, which is at the same time the solution
to the minimum mean-squared error problem:

$$ \min_{\hat{x}_k} E\{ (x_k - \hat{x}_k)^T Q \, (x_k - \hat{x}_k) \mid z^k \} , \quad Q \text{ is p.d.,} $$

subject to

$$ f_x(x_k \mid z^{k-1}) = N(x_k \mid \bar{x}_k, M_k) , $$

$f(\eta_k \mid z^{k-1})$ is an arbitrary p.d.f.,

$x_k$ and $\eta_k$ are conditionally independent, and

$f_z(z_k \mid z^{k-1})$ is twice differentiable.

This can be easily seen by expanding the quadratic term in the
cost function and differentiating w.r.t. $\hat{x}_k$.

2 - By making the assumption, as we already did in our problem,
that $f(x_{k-1} \mid z^{k-1}) = N(x_{k-1} \mid \hat{x}_{k-1|k-1}, V(k-1))$, it follows that
$f(x_k \mid z^{k-1}) = N(x_k \mid \Phi_{k,k-1} \hat{x}_{k-1|k-1}, V(k|k-1))$, and the
requirement of the theorem that the prior of $x_k$ be Gaussian is
satisfied.

3 - If we can show that $x_k$ and $\eta_k$ are conditionally independent,
and bearing in mind that the p.d.f. of $z_k$ is actually a Gaussian mixture and
hence is twice differentiable, then the remaining requirements
of the theorem hold and we can apply it to our problem.

Let us now check this last step.

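By definition $\eta_k = v_k + \gamma_k w_k$. Hence:

$$ f(H_k x_k, \eta_k \mid z^{k-1}) = f(H_k x_k, v_k + \gamma_k w_k \mid z^{k-1}) = f(v_k + \gamma_k w_k \mid z^{k-1}) \, f(H_k x_k \mid z^{k-1}, v_k + \gamma_k w_k) . $$

The last term on the R.H.S. should equal $f(H_k x_k \mid z^{k-1})$ for
conditional independence. We can write it as:

$$
\begin{aligned}
f(H_k x_k \mid z^{k-1}, v_k + \gamma_k w_k)
&= \frac{f(H_k x_k, z^{k-1}, v_k + \gamma_k w_k)}{f(z^{k-1}, v_k + \gamma_k w_k)} \\
&= \frac{f(H_k x_k, z^{k-1}, v_k, \gamma_k = 0) + f(H_k x_k, z^{k-1}, v_k + w_k, \gamma_k = 1)}{f(z^{k-1}, v_k, \gamma_k = 0) + f(z^{k-1}, v_k + w_k, \gamma_k = 1)} \\
&= \frac{f(v_k) \Pr\{\gamma_k = 0\} f(H_k x_k, z^{k-1} \mid \gamma_k = 0) + f(v_k + w_k) \Pr\{\gamma_k = 1\} f(H_k x_k, z^{k-1} \mid \gamma_k = 1)}{f(v_k) \Pr\{\gamma_k = 0\} f(z^{k-1} \mid \gamma_k = 0) + f(v_k + w_k) \Pr\{\gamma_k = 1\} f(z^{k-1} \mid \gamma_k = 1)} .
\end{aligned}
$$

... in $H_k x_k$ and $z^{k-1}$ appearing in the numerator and denominator
are equal, then we have conditional independence. Therefore,
in applying the theorem to our problem we make an error which
depends on how closely the above conditions are satisfied.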

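The Approximate Non-Gaussian Filter for $\eta_k = v_k + \gamma_k w_k$

An inspection of Eqs. (A-1) and (A-2) shows that the filter
is determined once the p.d.f. $f(z_k \mid z^{k-1})$ and its derivatives are
evaluated. So we begin with $f(z_k \mid z^{k-1})$. We have

$$ z_k = H_k x_k + \eta_k , $$

where

$$ f(x_k \mid z^{k-1}) = N(x_k \mid \Phi_{k,k-1} \hat{x}_{k-1|k-1}, V(k|k-1)) $$

and

$$ f(\eta_k \mid z^{k-1}) = f(v_k + \gamma_k w_k \mid z^{k-1}) = (1-p) \, N(\eta_k \mid 0, V_v(k)) + p \, N(\eta_k \mid 0, V_v(k) + V_w(k)) , $$

where we substituted p for $\Pr\{\gamma_k = 1 \mid z^{k-1}\}$. It follows, therefore,
that

$$
\begin{aligned}
f(z_k \mid z^{k-1})
&= (1-p) \, N\big(z_k \mid H_k \Phi_{k,k-1} \hat{x}_{k-1|k-1}, \; H_k V(k|k-1) H_k^T + V_v(k)\big) \\
&\quad + p \, N\big(z_k \mid H_k \Phi_{k,k-1} \hat{x}_{k-1|k-1}, \; H_k V(k|k-1) H_k^T + V_v(k) + V_w(k)\big) \\
&= (1-p) \, \frac{e^{-\frac{1}{2} (z_k - \mu)^T V_1^{-1} (z_k - \mu)}}{(2\pi)^{m/2} |V_1|^{1/2}}
 + p \, \frac{e^{-\frac{1}{2} (z_k - \mu)^T V_2^{-1} (z_k - \mu)}}{(2\pi)^{m/2} |V_2|^{1/2}} ,
\end{aligned}
\tag{A-3}
$$

with m the dimension of $z_k$. Here we used the substitutions

$$
\begin{aligned}
\mu &= H_k \Phi_{k,k-1} \hat{x}_{k-1|k-1} , \\
V_1 &= H_k V(k|k-1) H_k^T + V_v(k) , \\
V_2 &= H_k V(k|k-1) H_k^T + V_v(k) + V_w(k) .
\end{aligned}
$$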


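By differentiating Eq. (A-3) w.r.t. $z_k$ we get, after some
algebraic manipulations:

$$ g(z_k) = - \frac{\partial f(z_k \mid z^{k-1}) / \partial z_k}{f(z_k \mid z^{k-1})} = (1-q) V_1^{-1} (z_k - \mu) + q V_2^{-1} (z_k - \mu) , \tag{A-4} $$

where

$$
q = \frac{\dfrac{p}{1-p} \dfrac{|V_1|^{1/2}}{|V_2|^{1/2}} \, e^{-\frac{1}{2} \left( \|z_k - \mu\|^2_{V_2^{-1}} - \|z_k - \mu\|^2_{V_1^{-1}} \right)}}
{1 + \dfrac{p}{1-p} \dfrac{|V_1|^{1/2}}{|V_2|^{1/2}} \, e^{-\frac{1}{2} \left( \|z_k - \mu\|^2_{V_2^{-1}} - \|z_k - \mu\|^2_{V_1^{-1}} \right)}} .
\tag{A-5}
$$

Next we differentiate $g(z_k)$ w.r.t. $z_k$ in order to get $G(z_k)$:

$$
G(z_k) = \frac{\partial g(z_k)}{\partial z_k}
= (1-q) V_1^{-1} + q V_2^{-1}
- (1-q) q \left[ (V_1^{-1} - V_2^{-1}) (z_k - \mu) (z_k - \mu)^T (V_1^{-1} - V_2^{-1}) \right] .
\tag{A-6}
$$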
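
The measurement update implied by Eqs. (A-1), (A-2) and (A-4) to (A-6) can be sketched for a scalar state and measurement. The function name `ang_update` and its argument names are hypothetical, and the scalar restriction is an assumption made to keep the example free of matrix code; `x_pred` and `m_pred` stand for the predicted mean and variance of $x_k$ given $z^{k-1}$.

```python
import math


def gauss(x, mean, var):
    """Scalar Gaussian density N(x | mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def ang_update(z, x_pred, m_pred, h, v_v, v_w, p):
    """Scalar approximate non-Gaussian measurement update (illustrative).

    z        : measurement z_k
    x_pred   : predicted state mean
    m_pred   : predicted state variance
    h        : observation gain H_k
    v_v, v_w : variances of v_k and w_k
    p        : prior probability Pr{gamma_k = 1 | z^{k-1}}
    Returns the updated mean, the updated variance and the posterior q.
    """
    mu = h * x_pred
    v1 = h * m_pred * h + v_v                 # V_1
    v2 = v1 + v_w                             # V_2
    n1 = (1.0 - p) * gauss(z, mu, v1)
    n2 = p * gauss(z, mu, v2)
    q = n2 / (n1 + n2)                        # (A-5), written as a ratio
    r = z - mu
    g = ((1.0 - q) / v1 + q / v2) * r         # (A-4)
    big_g = ((1.0 - q) / v1 + q / v2
             - (1.0 - q) * q * (1.0 / v1 - 1.0 / v2) ** 2 * r * r)  # (A-6)
    x_hat = x_pred + m_pred * h * g           # (A-1)
    p_post = m_pred - m_pred * h * big_g * h * m_pred               # (A-2)
    return x_hat, p_post, q
```

As a consistency check, setting `p = 0` or `p = 1` makes the mixture a single Gaussian, and the update reduces to the ordinary Kalman update with measurement noise variance `v_v` or `v_v + v_w`, respectively.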


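While $p = \Pr\{\gamma_k = 1 \mid z^{k-1}\}$ is the prior probability of $\gamma_k$ given $z^{k-1}$,
q is in fact the posterior probability of $\gamma_k$ given $z^k$. This is
seen as follows:

$$
\begin{aligned}
\Pr\{\gamma_k = 1 \mid z^k\}
&= \frac{\Pr\{\gamma_k = 1 \mid z^{k-1}\} \, f(z_k \mid z^{k-1}, \gamma_k = 1)}
{\Pr\{\gamma_k = 0 \mid z^{k-1}\} \, f(z_k \mid z^{k-1}, \gamma_k = 0) + \Pr\{\gamma_k = 1 \mid z^{k-1}\} \, f(z_k \mid z^{k-1}, \gamma_k = 1)} \\
&= \frac{\dfrac{p}{1-p} \dfrac{f(z_k \mid z^{k-1}, \gamma_k = 1)}{f(z_k \mid z^{k-1}, \gamma_k = 0)}}
{1 + \dfrac{p}{1-p} \dfrac{f(z_k \mid z^{k-1}, \gamma_k = 1)}{f(z_k \mid z^{k-1}, \gamma_k = 0)}} \\
&= \frac{\dfrac{p}{1-p} \dfrac{|V_1|^{1/2}}{|V_2|^{1/2}} \, e^{-\frac{1}{2} \left[ \|z_k - \mu\|^2_{V_2^{-1}} - \|z_k - \mu\|^2_{V_1^{-1}} \right]}}
{1 + \dfrac{p}{1-p} \dfrac{|V_1|^{1/2}}{|V_2|^{1/2}} \, e^{-\frac{1}{2} \left[ \|z_k - \mu\|^2_{V_2^{-1}} - \|z_k - \mu\|^2_{V_1^{-1}} \right]}} \\
&= q .
\end{aligned}
$$

noting that the filter equations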